Giorgia Crosilla

DHDK graduate at UniBo

I obtained a Master's degree in Digital Humanities and Digital Knowledge from the University of Bologna in 2025 and a BA in Cultural Heritage from the University of Udine in 2022. Between 2023 and 2024, I collaborated with the Bibliotheca Hertziana on developing a website for the digital edition of Heinrich Wölfflin's works. During my thesis research internship at I Tatti, I studied the use of Large Language Models (LLMs) for deciphering handwritten text, focusing on benchmarking multimodal LLMs for Handwritten Text Recognition (HTR) on multilingual datasets, including the Belle Greene dataset. I also used LLMs to extract entities from Mary Berenson's diaries and integrated this information into RDF following the OA schema. Between 2024 and the beginning of 2025, I collaborated with FBK (Fondazione Bruno Kessler) on a digitization and script development project involving historical journals.

LOD it be

"LOD it be" is the final project for the "Knowledge Organization and Cultural Heritage" exam.

Website Repository

Data Science Project

Repository

CENA

CENA (Capellini Encounters Native Americans) is an interactive experience created ad hoc for the Capellini Museum in Bologna.

Website Repository

MetaScript

"MetaScript" is an open-source tool that provides a basic textual analysis framework for comparing an original text (a novel, a short story, etc.) with its film transposition (the screenplay). Its goal is to provide a set of guidelines for the markup and metadata enrichment of multimodal texts, such as a book and its screenplay adaptation for audiovisual transposition.

Website Repository

SecretMag

"SecretMag" is the final project for the "Information Modeling and Web Technologies" exam.

Website Repository

BERTolde

BERTolde is the final project for the "Natural Language Processing" exam. It involves fine-tuning four models (BERT, RoBERTa, GilBERTo, and UmBERTo) for Named Entity Recognition in Italian and evaluating their performance, using the EVALITA 2009 dataset, which contains articles from the newspaper "L'Adige".

Repository

Cervelli in Fuga

"Cervelli in Fuga" is the final project for the exam "Open Access and Digital Ethics".

Website Repository

Cognitive Bias Ontology

This is the final project for the exam "Knowledge Organization and Extraction".

Website Repository

Deciphering Aldrovandi's copyists' handwriting using Transkribus

Final project for the Semantic Digital Libraries exam: the development of two models for the automatic transcription of some of Aldrovandi's manuscripts.

Website Repository

Reputation Era

Final project for the Information Visualisation exam: an analysis and visualization of how artists' reputations change over time.

Website Repository

Benchmarking Multimodal Large Language Models for Handwritten Text Recognition

Article derived from my Master's thesis.

Pre-print

Belle Greene HTR dataset

The Belle Greene Dataset is a collection designed for full-page handwritten text recognition tasks. It includes 4,135 digitized images of 605 letters written by Belle Greene to Bernard Berenson between 1909 and 1948. Accompanying these images are ground-truth semi-diplomatic transcriptions in .txt and .xml formats, along with metadata extracted from the letters.

Zenodo repository

Semantic Annotator Extractor

In this project, LLMs (Claude Sonnet 3.5 Latest and GPT-4) are used to generate complete OA RDF annotations for places and dates extracted from diary pages, and to disambiguate the places by resolving them to Wikidata IDs.

GitHub repository